Manipulate-to-Navigate: Reinforcement Learning with Visual Affordances and Manipulability Priors

Zhang, Yuying, Pajarinen, Joni

arXiv.org Artificial Intelligence

Mobile manipulation in dynamic environments is challenging due to movable obstacles blocking the robot's path. Traditional methods, which treat navigation and manipulation as separate tasks, often fail in such 'manipulate-to-navigate' scenarios, as obstacles must be removed before navigation. In these cases, active interaction with the environment is required to clear obstacles while ensuring sufficient space for movement. To address the manipulate-to-navigate problem, we propose a reinforcement learning-based approach for learning manipulation actions that facilitate subsequent navigation. Our method combines manipulability priors, which focus the robot on body positions with high manipulability, with affordance maps for selecting high-quality manipulation actions. By focusing on feasible and meaningful actions, our approach reduces unnecessary exploration and allows the robot to learn manipulation strategies more effectively. We present two new manipulate-to-navigate simulation tasks, Reach and Door, with the Boston Dynamics Spot robot. The first task tests whether the robot can select a good hand position in the target area such that the robot base can move effectively forward while keeping the end-effector position fixed. The second task requires the robot to move a door aside in order to clear the navigation path. Both tasks require manipulation first, followed by navigating the base forward. Results show that our method allows a robot to effectively interact with and traverse dynamic environments. Finally, we transfer the learned policy to a real Boston Dynamics Spot robot, which successfully performs the Reach task.
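The action-selection step this abstract describes — steering exploration toward actions that are both afforded and reachable with high manipulability — can be sketched as below. The elementwise multiplicative combination of the two maps, the array shapes, and the function name are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def select_hand_position(affordance_map, manipulability_prior):
    """Combine per-pixel affordance scores with a manipulability prior
    and return the highest-scoring pixel as the candidate hand position.

    Both inputs are H x W arrays of nonnegative scores; the product
    keeps only positions that score well under BOTH criteria.
    """
    assert affordance_map.shape == manipulability_prior.shape
    combined = affordance_map * manipulability_prior
    # argmax over the flattened array, mapped back to (row, col)
    return np.unravel_index(np.argmax(combined), combined.shape)

# Toy example: the affordance peak survives only where the prior is nonzero.
aff = np.array([[0.9, 0.1], [0.2, 0.8]])
prior = np.array([[0.0, 1.0], [1.0, 1.0]])
pos = select_hand_position(aff, prior)
```

In the paper the RL policy is trained over such focused action candidates rather than the full action space; the sketch only shows the filtering idea.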


Self-Body Image Acquisition and Posture Generation with Redundancy using Musculoskeletal Humanoid Shoulder Complex for Object Manipulation

Koga, Yuya, Kawaharazuka, Kento, Toshimitsu, Yasunori, Nishiura, Manabu, Omura, Yusuke, Asano, Yuki, Okada, Kei, Kawasaki, Koji, Inaba, Masayuki

arXiv.org Artificial Intelligence

We proposed a method for learning the actual body image of a musculoskeletal humanoid for posture generation and object manipulation using inverse kinematics with redundancy in the shoulder complex. The effectiveness of this method was confirmed by realizing automobile steering wheel operation. The shoulder complex has a scapula that glides over the rib cage and an open spherical joint, and is supported by numerous muscle groups, enabling a wide range of motion. As a development of the human-mimetic shoulder complex, we increased the muscle redundancy by implementing deep muscles and stabilized the joint drive. As a posture generation method to utilize the joint redundancy of the shoulder complex, we consider inverse kinematics based on the scapular drive strategy suggested by the scapulohumeral rhythm of the human body. In order to control a complex robot imitating a human body, it is essential to learn its own body image, but it is difficult for such a robot to know its own state accurately due to deformation that is difficult to measure. To solve this problem, we developed a method to acquire a self-body image that can be updated appropriately by recognizing the hand position relative to an object for the purpose of object manipulation. We apply the above methods to a full-body musculoskeletal humanoid, Kengoro, and confirm their effectiveness by conducting an experiment to operate a car steering wheel, which requires the appropriate use of both arms.


Motion Prediction with Gaussian Processes for Safe Human-Robot Interaction in Virtual Environments

Mugisha, Stanley, Guda, Vamsi Krishna, Chevallereau, Christine, Chablat, Damien, Zoppi, Matteo

arXiv.org Artificial Intelligence

Humans use collaborative robots as tools for accomplishing various tasks. The interaction between humans and robots happens in tight shared workspaces. However, these machines must be safe to operate alongside humans to minimize the risk of accidental collisions. Ensuring safety imposes many constraints, such as reduced torque and velocity limits during operation, thus increasing the time to accomplish many tasks. For applications such as using collaborative robots as haptic interfaces with intermittent contacts in virtual reality, these speed limitations result in poor user experiences. This research aims to improve the efficiency of a collaborative robot while improving the safety of the human user. We used Gaussian process models to predict human hand motion and developed strategies for human intention detection based on hand motion and gaze, in order to reduce the robot's task time and improve human safety in a virtual environment. We then studied the effect of prediction. Results from comparisons show that the prediction models improved the robot time by 3% and safety by 17%. When used alongside gaze, prediction with Gaussian process models improved the robot time by 2% and safety by 13%.


Kinematically Constrained Human-like Bimanual Robot-to-Human Handovers

Göksu, Yasemin, Correia, Antonio De Almeida, Prasad, Vignesh, Kshirsagar, Alap, Koert, Dorothea, Peters, Jan, Chalvatzaki, Georgia

arXiv.org Artificial Intelligence

Bimanual handovers are crucial for transferring large, deformable, or delicate objects. This paper proposes a framework for generating kinematically constrained human-like bimanual robot motions to ensure seamless and natural robot-to-human object handovers. We use a Hidden Semi-Markov Model (HSMM) to reactively generate suitable response trajectories for a robot based on the observed human partner's motion. The trajectories are adapted with task space constraints to ensure accurate handovers. Results from a pilot study show that our approach is perceived as more human-like compared to a baseline Inverse Kinematics approach.
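Reactive generation with an HSMM typically reduces, per state, to conditioning a joint Gaussian over human and robot variables on the observed human part (a Gaussian-mixture-regression step). A sketch of that single conditioning operation is below; the partitioning into human/robot blocks is standard, but treating it in isolation from the HSMM's state and duration machinery is a simplification:

```python
import numpy as np

def condition_gaussian(mu, sigma, obs):
    """Condition a joint Gaussian over [human; robot] variables on an
    observed human part. Returns the robot-part posterior mean and
    covariance — the per-state building block of HSMM-based reactive
    motion generation.
    """
    h = len(obs)
    mu_h, mu_r = mu[:h], mu[h:]
    S_hh = sigma[:h, :h]
    S_rh = sigma[h:, :h]
    S_rr = sigma[h:, h:]
    gain = S_rh @ np.linalg.inv(S_hh)           # regression gain
    mu_cond = mu_r + gain @ (obs - mu_h)        # shift robot mean
    sigma_cond = S_rr - gain @ S_rh.T           # shrink uncertainty
    return mu_cond, sigma_cond

# 1 human DoF, 1 robot DoF, positively correlated:
mu = np.zeros(2)
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
mu_r, S_r = condition_gaussian(mu, sigma, np.array([1.0]))
```

The full model mixes such conditionals across states, weighted by the HSMM's forward variable, and then applies the paper's task-space constraints on top.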


The Scariest Thing About M3gan

Slate

This weekend, I succumbed to the pull of all the meme-y marketing and went to the theater to see the surprise horror-comedy hit M3gan. I generally enjoyed it--the jokes are funny, the jump scares effective, the robot-centric plot a rather smart addition to our fresh new wave of artificial intelligence anxiety. It isn't the goriest or most frightening flick--the blood streams had to stay PG-13--but the steadily paced tension and the references to horror classics do their job fine. Yet, to me, the most chilling aspect of the movie doesn't come from anything you might expect: the offscreen murders, M3gan's deranged humanoid face, the pressures of capitalism. It actually stems from a deceptively insignificant 10-second scene that comes about halfway through the movie, in which the titular bot takes to the house piano. To be clear, I don't find this scene so viscerally terrifying for the piano tune itself (in the film, a solid instrumental cover of Martika's 1989 No. 1 hit "Toy Soldiers"), or for the overall menace of the moment, a turning point in M3gan's development.


Developing hierarchical anticipations via neural network-based event segmentation

Gumbsch, Christian, Adam, Maurits, Elsner, Birgit, Martius, Georg, Butz, Martin V.

arXiv.org Artificial Intelligence

Humans can make predictions on various time scales and hierarchical levels. Thereby, the learning of event encodings seems to play a crucial role. In this work we model the development of hierarchical predictions via autonomously learned latent event codes. We present a hierarchical recurrent neural network architecture whose inductive learning biases foster the development of sparsely changing latent states that compress sensorimotor sequences. A higher-level network learns to predict the situations in which the latent states tend to change. Using a simulated robotic manipulator, we demonstrate that the system (i) learns latent states that accurately reflect the event structure of the data, (ii) develops meaningful temporally abstract predictions on the higher level, and (iii) generates goal-anticipatory behavior similar to gaze behavior found in eye-tracking studies with infants. The architecture offers a step towards the autonomous learning of compressed hierarchical encodings of gathered experiences and the exploitation of these encodings to generate adaptive behavior.
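The key inductive bias — a latent event code that stays constant within an event and switches only at boundaries — can be illustrated with a thresholded-surprise toy. The paper learns the gating with a recurrent network; the running-mean statistic and fixed threshold below are stand-ins for that learned mechanism:

```python
import numpy as np

def segment_events(seq, threshold=0.5):
    """Toy event segmentation over a 1-D sensor sequence: open the gate
    (start a new latent event code) whenever the observation deviates
    from the running within-event mean by more than `threshold`.
    Returns the indices at which new events begin.
    """
    boundaries, mean, n = [0], seq[0], 1
    for i, x in enumerate(seq[1:], start=1):
        if abs(x - mean) > threshold:
            boundaries.append(i)       # surprise: event boundary
            mean, n = x, 1             # reset event statistics
        else:
            n += 1
            mean += (x - mean) / n     # update running event mean
    return boundaries

# Two flat regimes -> one detected boundary at the jump.
events = segment_events([0.0, 0.1, 0.0, 1.0, 1.1, 1.0])
```

A higher-level model, as in the paper, would then be trained to predict where such boundaries occur, yielding temporally abstract anticipations.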


Why Tesla's New "Yoke" Steering Wheel Could Be a Safety Problem

Slate

For once we can say that Tesla really has reinvented a wheel. For its newest Model S sedans and Model X SUVs, the carmaker dropped the traditional circular steering wheel in favor of what it's calling a "yoke." This yoke is rectangular and reminiscent of what you might see in a jet or a racecar. Tesla CEO Elon Musk indicated that the company made the change because, "Yet another round wheel is boring & blocks the screen," adding that Tesla's "Full Self-Driving" function--controversial due to safety concerns--"in panoramic mode looks way better with a yoke." Consumer Reports recently published a harsh review entirely focused on the Model S yoke, noting that the organization's test drivers found the steering apparatus to be hard to hold on to and awkward to maneuver.


Unsupervised state representation learning with robotic priors: a robustness benchmark

Lesort, Timothée, Seurin, Mathieu, Li, Xinrui, Díaz-Rodríguez, Natalia, Filliat, David

arXiv.org Artificial Intelligence

Our understanding of the world depends highly on our capacity to produce intuitive and simplified representations which can be easily used to solve problems. We reproduce this simplification process using a neural network to build a low-dimensional state representation of the world from images acquired by a robot. As in Jonschkowski et al. 2015, we learn in an unsupervised way using prior knowledge about the world, encoded as loss functions called robotic priors, and extend this approach to higher-dimensional, richer images to learn a 3D representation of the hand position of a robot from RGB images. We propose a quantitative evaluation of the learned representation using nearest neighbors in the state space, which allows us to assess its quality and show both the potential and limitations of robotic priors in realistic environments. We augment image size and add distractors and domain randomization, all crucial components for achieving transfer learning to real robots. Finally, we also contribute a new prior to improve the robustness of the representation. The applications of such low-dimensional state representations range from easing reinforcement learning (RL) and knowledge transfer across tasks, to facilitating learning from raw data with more efficient and compact high-level representations. The results show that the robotic prior approach is able to extract a high-level representation, such as the 3D position of an arm, and organize it into a compact and coherent space of states on a challenging dataset.
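Robotic priors are physics-inspired losses over the learned state trajectory. Two of the standard priors from Jonschkowski et al. 2015 are sketched below in numpy; the variance-based form of the proportionality loss is a simplification of the paper's pairwise formulation, and these are not the authors' exact training code:

```python
import numpy as np

def temporal_coherence_loss(states):
    """Temporal-coherence prior: physically meaningful states change
    slowly, so penalize squared state differences between consecutive
    time steps. `states` is a T x D trajectory of learned states.
    """
    diffs = np.diff(states, axis=0)
    return np.mean(np.sum(diffs ** 2, axis=1))

def proportionality_loss(states, actions):
    """Proportionality prior: the same action should cause state
    changes of the same magnitude. Penalize the variance of step
    magnitudes among steps sharing an action label.
    """
    mags = np.linalg.norm(np.diff(states, axis=0), axis=1)
    acts = np.asarray(actions)[:-1]        # action taken at each step
    loss = 0.0
    for a in np.unique(acts):
        m = mags[acts == a]
        if len(m) > 1:
            loss += np.var(m)
    return loss

# A straight-line trajectory with constant steps satisfies both priors
# up to the step size itself.
states = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
tc = temporal_coherence_loss(states)
prop = proportionality_loss(states, [0, 0, 0])
```

The full method sums several such priors (causality and repeatability, plus the paper's new robustness prior) and backpropagates them through the encoder network.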